# Financial Document Processing

Typhoon Ocr 7b
A vision-language model designed for parsing real-world Thai-English documents, built on Qwen2.5-VL-Instruct.
Image-to-Text Transformers Supports Multiple Languages
scb10x
126
9
Qwen2 VL 2B OCR
Apache-2.0
Qwen2-VL-2B-OCR is an OCR model fine-tuned from unsloth/Qwen2-VL-2B-Instruct that specializes in extracting complete text from documents, tables, and payroll images.
Image-to-Text Transformers English
JackChew
842
4
OCR TextInput Base
An image-to-text model for the financial domain that recognizes English text, used primarily to process images in financial documents.
Text Recognition Transformers English
rohit5895
31
0
Donut Base Finetuned Cord V2
Donut is a visual document understanding model based on Swin Transformer; this checkpoint is fine-tuned on the CORD dataset and extracts structured text information from images.
Image-to-Text Transformers
Xenova
32
0
Tatr Tab Struct V2
A DETR-based model trained on the PubTables-1M and FinTabNet datasets for table structure recognition.
Text Recognition Transformers
deepdoctection
99
2
Donut Base Finetuned Invoices
A multilingual invoice processing model built on the Donut architecture that extracts key invoice fields.
Image-to-Text Transformers
to-be
823
21
Layout Xlm Base Finetuned With DocLayNet Base At Linelevel Ml384
MIT
A line-level document understanding model, fine-tuned from the LayoutXLM base model on the DocLayNet dataset, that supports multilingual document layout analysis and token classification.
Text Recognition Transformers Supports Multiple Languages
pierreguillou
103
3
Lilt Xlm Roberta Base Finetuned With DocLayNet Base At Paragraphlevel Ml512
MIT
A document understanding model that analyzes document layout and content by classifying tokens at the paragraph level.
Text Recognition Transformers Supports Multiple Languages
pierreguillou
126
3
Lilt Xlm Roberta Base Finetuned With DocLayNet Base At Linelevel Ml384
MIT
A line-level document understanding model, fine-tuned from LiLT on the DocLayNet dataset, that supports multilingual document layout analysis.
Image-to-Text Transformers Supports Multiple Languages
pierreguillou
700
12
Receipt Paper Invoice Document
An image classification model generated with HuggingPics that sorts images into four categories: receipts, paper, invoices, and documents.
Image Classification Transformers
mustafamurat
17
0
Donut Demo
MIT
A Donut model fine-tuned on the CORD-v2 dataset for image-to-text tasks, with a reported average accuracy of 0.901.
Image-to-Text Transformers
katanaml
24
3
OCR LayoutLMv3 Invoice
An invoice recognition model, fine-tuned from LayoutLMv3-base on the wild_receipt dataset, that extracts structured information from invoices.
Sequence Labeling Transformers
jinhybr
340
8
Vit Receipts Classifier
Apache-2.0
A binary classifier based on the ViT architecture that identifies whether an image is a receipt or invoice.
Image Classification Transformers
jjmcarrascosa
75
2
Layoutlm Document Qa
MIT
A multimodal LayoutLM model fine-tuned for document question answering; it uses both the text and the layout of a document to answer questions.
Document Question Answering Transformers English
impira
26.10k
1,102
Layoutlmv3 Cord Ner
A document understanding model fine-tuned from LayoutLMv3-base for named entity recognition on the CORD dataset.
Text Recognition Transformers
renjithks
26
0